A hidden Markov model-based approach for emotional speech synthesis
نویسندگان
چکیده
In this paper, we describe an approach to automatically synthesize the emotional speech of a target speaker based on the hidden Markov model for his/her neutral speech. The basic idea is the model interpolation between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the interpolation model selection and the interpolation weight computation are determined based on a modeldistance measure. In this paper, we propose a monophonebased Mahalanobis distance (MBMD). We evaluate our approach on the synthesized emotional speech of angriness, happiness, and sadness with several subjective tests. Experimental results show that the implemented system is able to synthesize speech with emotional expressiveness of the target speaker.
منابع مشابه
A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM
This paper describes a technique for modeling and controlling emotional expressivity of speech in HMM-based speech synthesis. A problem of conventional emotional speech synthesis based on HMM is that the intensity of an emotional expression appearing in synthetic speech completely depends on the database used for model training. To take into account the emotional expressivity that listeners act...
متن کاملEvaluation of Hidden Semi-Markov Models Training Methods for Greek Emotional Text-to-Speech Synthesis
This paper describes and evaluates four different HSMM (hidden semi-Markov model) training methods for HMM-based synthesis of emotional speech. The first method, called emotion-dependent modelling, uses individual models trained for each emotion separately. In the second method, emotion adaptation modelling, at first a model is trained using neutral speech, and thereafter adaptation is performe...
متن کاملSpeech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on the independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a Maximum a posterior (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
متن کاملStyle estimation of speech based on multiple regression hidden semi-Markov model
This paper presents a technique for estimating the degree or intensity of emotional expressions and speaking styles appeared in speech. The key idea is based on a style control technique for speech synthesis using multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse process of the style control. We derive an algorithm for estimating pred...
متن کاملA style control technique for speech synthesis using multiple regression HSMM
This paper presents a technique for controlling intuitively the degree or intensity of speaking styles and emotional expressions of synthetic speech. The conventional style control technique based on multiple regression HMM (MRHMM) has a problem that it is difficult to control phone duration of synthetic speech because HMM has no explicit parameter which models phone duration appropriately. To ...
متن کامل